1. Introduction. Let $X_1, X_2, \ldots, X_n$ be independently and identically distributed random variables (i.i.d. r.v.) with distribution function $F$; we discuss the composite goodness-of-fit test given by the test of $H_0: F \in \mathcal{F}_0$ versus the alternative $H_a: F \in \mathcal{F} - \mathcal{F}_0$, where $\mathcal{F}$ is some large class of distributions and $\mathcal{F}_0$ is a parametric family to be tested. For example, $\mathcal{F}_0$ may be the family of exponential distributions and $\mathcal{F}$ the family of continuous distributions.


Many test procedures are based on characterizing properties of the family $\mathcal{F}_0$, taking the form: a statistic $T(X_1, \ldots, X_n; \mathcal{F}_0)$ has distribution $Q$, where $Q$ is a unique distribution function, if and only if $F \in \mathcal{F}_0$; the statistic $T$ and the distribution $Q$ may be univariate or multivariate. An example is a transformation from $X_1, \ldots, X_n$ to a statistic $T = (Z_1, \ldots, Z_m)$, $(m < n)$, where the statistics $Z_i$ are uniforms, i.e. i.i.d. with the uniform distribution between 0 and 1, which we shall write U(0,1), or where the $Z_i$ are ordered uniforms, i.e. distributed like a random sample from U(0,1) which has then been placed in ascending order. Examples of these transformations are the Conditional Probability Integral Transformation (CPIT) discussed in O'Reilly and Quesenberry (1973) and in Rincon Gallardo, Quesenberry and O'Reilly (1979), which give uniforms, and the J and K transformations in Seshadri, Csorgo and Stephens (1969), which transform exponential random variables to a sample of ordered uniforms. For other examples of characterizations see Prohorov (19..) and Seshadri and Csorgo (1969). In particular, the final step in the test of $H_0$ is usually to calculate a test statistic $T_1$ based on the values of $T$, and compare it with its distribution under $H_0$. The "if" part of the characterization above gives what is usually referred to as the distribution-free property of $T$.

If the "only if" part fails, meaning that $T$ might have distribution $Q$ for some $F^* \in \mathcal{F} - \mathcal{F}_0$, it then follows that a test of $H_0$ based on $T$ will have power equal to the significance level $\alpha$ for detecting the alternative $F^*$. For a good test, therefore, a characterization is needed.




Invariant Characterizations. Even when a characterization of $\mathcal{F}_0$ forms the basis of a test, the values in $T$, and the value of $T_1$ calculated from $T$, may depend on the order in which the original $X_i$ are indexed. When this is not the case we say that the characterization is invariant; then all statisticians following the test procedure will obtain the same value of the test statistic, and in general this has an intuitive appeal.

In this paper we concentrate on the problem of testing for the exponential distribution; $\mathcal{F}_0$ is the family with members $F(x) = 1 - \exp(-x/\theta)$, $x > 0$, with $\theta > 0$ unknown. Two important procedures studied by Seshadri, Csorgo and Stephens (1969), called the J and K transformations, are based on characterizations; J is not invariant, but K is, and these authors show that K is superior in terms of power. Thus, more support is given to the use of invariant characterization procedures, at least for the exponential case.

It therefore seems worthwhile to develop a systematic approach which gives an invariant characterization. This is done in section 3 by means of the CPIT. The details are worked out for the exponential test, and it is shown that the subsequent characterization is connected with K while the usual CPIT is connected with J. We investigate these procedures, and a variant of K, by means of power studies. They extend and support the results of Seshadri, Csorgo and Stephens (1969). Thus a good approach to providing structure in goodness-of-fit procedures seems to be the search for invariant characterizations. Unfortunately, the general application of the invariant CPIT will be difficult computationally, and we conclude the paper with some comments on these problems.


2. Transformations involving characterization. Throughout the paper, we employ the following notation. The r.v. $X_1, X_2, \ldots, X_n$ will denote an unordered exponential sample whereas $X_{(1)}, X_{(2)}, \ldots, X_{(n)}$ will denote the corresponding sample, ordered in ascending order. Similarly $Z_1, Z_2, \ldots, Z_n$ will denote an unordered U(0,1) sample and $Z_{(1)}, Z_{(2)}, \ldots, Z_{(n)}$ will stand for the corresponding ordered sample. When necessary, $X_1^*, X_2^*, \ldots, X_n^*$ and $Z_1^*, Z_2^*, \ldots, Z_n^*$ will also denote unordered exponential and U(0,1) samples respectively.

The following transformations, which are characterizations in a sense specified in each case, are well known and can be traced in several places in the literature. See for example Sukhatme (1937), Galambos (1975), Ahsanullah (1978) and Stephens (1978). The J and K transformations are those used by Seshadri, Csorgo and Stephens (1969). We give the transformations in symbolic form where the meaning is obvious.

The J transformation.- This transforms n random exponentials into n-1 ordered uniforms.

$$J: (X_1, X_2, \ldots, X_n) \to (Z_{(1)}, Z_{(2)}, \ldots, Z_{(n-1)})$$

with

$$Z_{(j)} = \sum_{i=1}^{j} X_i \Big/ \sum_{i=1}^{n} X_i, \qquad j = 1, \ldots, n-1.$$

The i.i.d. nonnegative r.v. $X_1, X_2, \ldots, X_n$ with positive mean are exponentially distributed if and only if $Z_{(1)}, Z_{(2)}, \ldots, Z_{(n-1)}$ are distributed as an ordered sample of size (n-1) from U(0,1).
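As an illustration only, the J transformation can be sketched in Python with NumPy. The function name `j_transform` and the data are hypothetical and are not taken from the original paper; the sketch assumes the form $Z_{(j)} = \sum_{i \le j} X_i / \sum_{i \le n} X_i$.

```python
import numpy as np

def j_transform(x):
    """Map n nonnegative values to n-1 ordered values in (0, 1).

    Z_(j) = (X_1 + ... + X_j) / (X_1 + ... + X_n), j = 1, ..., n-1.
    The result depends on the order in which the X_i are indexed.
    """
    x = np.asarray(x, dtype=float)
    partial = np.cumsum(x)             # running sums X_1 + ... + X_j
    return partial[:-1] / partial[-1]  # drop the last ratio, which is always 1

# Hypothetical data with total 5.0
x = np.array([0.5, 1.5, 2.0, 1.0])
z = j_transform(x)  # 0.1, 0.4, 0.8
```

The output is increasing by construction, because the partial sums of positive values increase.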

The N transformation.- (Sukhatme, 1937). This changes an ordered exponential sample into an unordered exponential sample with new values.

$$N: (X_{(1)}, X_{(2)}, \ldots, X_{(n)}) \to (X_1^*, X_2^*, \ldots, X_n^*)$$

with

$$X_j^* = (n+1-j)\,(X_{(j)} - X_{(j-1)}), \qquad j = 1, \ldots, n, \qquad X_{(0)} \equiv 0.$$

For $X_{(1)}, X_{(2)}, \ldots, X_{(n)}$ the ordered values, the unordered i.i.d. nonnegative r.v. $X_1, X_2, \ldots, X_n$ with positive mean are exponentially distributed if and only if $X_1^*, X_2^*, \ldots, X_n^*$ are i.i.d. r.v. exponentially distributed.
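A minimal Python sketch of the normalized-spacings transformation of Sukhatme (called N in the compositions used later in the paper) follows. It assumes the form $X_j^* = (n+1-j)(X_{(j)} - X_{(j-1)})$ with $X_{(0)} = 0$; the function name `n_transform` and the data are hypothetical.

```python
import numpy as np

def n_transform(x):
    """Normalized spacings: X*_j = (n + 1 - j) * (X_(j) - X_(j-1)), X_(0) = 0."""
    xs = np.sort(np.asarray(x, dtype=float))  # order statistics X_(1) <= ... <= X_(n)
    n = xs.size
    gaps = np.diff(xs, prepend=0.0)           # X_(j) - X_(j-1)
    weights = np.arange(n, 0, -1)             # n, n-1, ..., 1
    return weights * gaps

x = np.array([2.0, 0.5, 1.0, 1.5])
x_star = n_transform(x)  # 2.0, 1.5, 1.0, 0.5
```

Because the input is sorted first, the output does not depend on how the sample is indexed, and the transformed values keep the same total as the original sample.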


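The E transformation.- This changes an ordered uniform sample into an unordered uniform sample with new values.

$$E: (Z_{(1)}, Z_{(2)}, \ldots, Z_{(n)}) \to (Z_1^*, Z_2^*, \ldots, Z_n^*)$$

with

$$Z_j^* = \left( Z_{(j)} / Z_{(j+1)} \right)^{j}, \qquad j = 1, \ldots, n, \qquad Z_{(n+1)} \equiv 1.$$

For $Z_{(1)}, Z_{(2)}, \ldots, Z_{(n)}$ the ordered values, the unordered i.i.d. r.v. $Z_1, Z_2, \ldots, Z_n$ are U(0,1) if and only if $Z_1^*, Z_2^*, \ldots, Z_n^*$ are i.i.d. U(0,1) r.v.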
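The E step can be sketched in Python as follows. This is a sketch under an assumption: it takes E to have the form $Z_j^* = (Z_{(j)}/Z_{(j+1)})^j$ with $Z_{(n+1)} = 1$, which is the form consistent with the relation W = E o J used later in the paper. The function name `e_transform` and the data are hypothetical.

```python
import numpy as np

def e_transform(z_ordered):
    """Z*_j = (Z_(j) / Z_(j+1)) ** j for j = 1..n, with Z_(n+1) = 1.

    The input is assumed to be already sorted in ascending order.
    """
    z = np.asarray(z_ordered, dtype=float)
    n = z.size
    upper = np.append(z[1:], 1.0)  # Z_(2), ..., Z_(n), then Z_(n+1) = 1
    j = np.arange(1, n + 1)
    return (z / upper) ** j

z = np.array([0.2, 0.4, 0.8])
z_star = e_transform(z)  # 0.5**1, 0.5**2, 0.8**3
```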


Another two transformations which are obvious characterizations will be needed.

The T transformation.- This takes a uniform sample into a new uniform sample by subtracting each value from 1.

$$T: (Z_1, Z_2, \ldots, Z_n) \to (Z_1^*, Z_2^*, \ldots, Z_n^*)$$

with $Z_j^* = 1 - Z_j$, $j = 1, \ldots, n$.

The R transformation.- This takes an exponential sample to a new sample by reversing the indexing of the original values. The new sample therefore has the same values as the old.

$$R: (X_1, X_2, \ldots, X_n) \to (X_1^*, X_2^*, \ldots, X_n^*)$$

with $X_j^* = X_{n+1-j}$, $j = 1, \ldots, n$.

The K transformation.- The K transformation is equivalent to first using N and then using J. Thus K, like J, transforms into n-1 ordered uniforms, and we write symbolically K = J o N.

By construction K is an invariant characterization whereas J is only a characterization. The power studies reported by Seshadri, Csorgo and Stephens (1969) for a wide variety of uniformity tests that follow the transformation show the clear superiority of K. Moreover, it is also noted that J produces, for some alternatives, transformed values that are more evenly spread in the unit interval than if they were uniformly distributed; these are called superuniforms, and arise for example when J maps samples from a half-normal distribution. J and K are connected with a uniforms-to-uniforms procedure first critically examined by Durbin (1961) and called G by Seshadri, Csorgo and Stephens (1969). Both G and E have the property that they transform superuniforms into a set which is more clustered, and more likely to be detected by usual tests of uniformity.
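The invariance claimed for K can be checked numerically. The following Python sketch (NumPy assumed; the function names and data are hypothetical, not from the original paper) takes J as the ratio of partial sums to the total and N as the normalized spacings, and shows that reindexing the sample changes the output of J but not that of K = J o N.

```python
import numpy as np

def j_transform(x):
    """Z_(j) = (X_1 + ... + X_j) / (X_1 + ... + X_n), j = 1..n-1."""
    partial = np.cumsum(np.asarray(x, dtype=float))
    return partial[:-1] / partial[-1]

def n_transform(x):
    """X*_j = (n + 1 - j) * (X_(j) - X_(j-1)), with X_(0) = 0."""
    xs = np.sort(np.asarray(x, dtype=float))
    return np.arange(xs.size, 0, -1) * np.diff(xs, prepend=0.0)

def k_transform(x):
    """K = J o N: apply N first, then J."""
    return j_transform(n_transform(x))

x = np.array([0.5, 1.5, 2.0, 1.0])
x_reindexed = x[::-1]  # same values, different indexing

same_k = np.allclose(k_transform(x), k_transform(x_reindexed))
same_j = np.allclose(j_transform(x), j_transform(x_reindexed))
```

Here `same_k` is true because N sorts the sample before anything else is done, while `same_j` is false because the partial sums depend on the indexing.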


The CPIT transformation C.- In O'Reilly and Quesenberry (1973) a transformation is proposed that produces (n-1) unordered uniforms $Z_1, Z_2, \ldots, Z_{n-1}$ from n unordered exponentials $X_1, X_2, \ldots, X_n$. The aim of such a transformation is to change the problem of testing exponentiality to that of testing uniformity. In that paper, the general CPIT procedure is given and is illustrated for several families. It is not shown that the procedure yields a characterization in general, and due to its construction, it is not an invariant procedure. For the exponential case, the CPIT transformation is given by

$$C: (X_1, X_2, \ldots, X_n) \to (Z_1, Z_2, \ldots, Z_{n-1})$$

with

$$Z_{j-1} = 1 - \left( 1 - X_j \Big/ \sum_{i=1}^{j} X_i \right)^{j-1}, \qquad j = 2, \ldots, n,$$

and it can easily be seen that this is the composition C = T o E o J. Hence for the exponential case C yields a characterization which is not invariant.
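The composition C = T o E o J can be checked numerically. The Python sketch below is illustrative only: it assumes the exponential CPIT has the form $Z_{j-1} = 1 - (1 - X_j/\sum_{i \le j} X_i)^{j-1}$, that E is $(Z_{(j)}/Z_{(j+1)})^j$ with the last denominator equal to 1, and that T is $z \mapsto 1 - z$. All function names and data are hypothetical.

```python
import numpy as np

def cpit_exponential(x):
    """Direct form: Z_{j-1} = 1 - (1 - X_j / (X_1 + ... + X_j)) ** (j-1), j = 2..n."""
    x = np.asarray(x, dtype=float)
    s = np.cumsum(x)
    j = np.arange(2, x.size + 1)
    return 1.0 - (1.0 - x[1:] / s[1:]) ** (j - 1)

def j_transform(x):
    partial = np.cumsum(np.asarray(x, dtype=float))
    return partial[:-1] / partial[-1]

def e_transform(z_ordered):
    z = np.asarray(z_ordered, dtype=float)
    upper = np.append(z[1:], 1.0)
    return (z / upper) ** np.arange(1, z.size + 1)

def t_transform(z):
    return 1.0 - np.asarray(z, dtype=float)

x = np.array([0.5, 1.5, 2.0, 1.0])
direct = cpit_exponential(x)                         # 0.75, 0.75, 0.488
composed = t_transform(e_transform(j_transform(x)))  # same values
```

The agreement follows from $1 - X_j/\sum_{i \le j} X_i = \sum_{i \le j-1} X_i / \sum_{i \le j} X_i$, which is the ratio of consecutive J values.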
The Wang and Chang transformation W.- Recently, Wang and Chang (1977) have proposed the use of a characterization of the exponential distribution, by means of a transformation W to a uniform sample given by

$$W: (X_1, X_2, \ldots, X_n) \to (Z_1, Z_2, \ldots, Z_{n-1})$$

with

$$Z_j = \left\{ \sum_{i=1}^{j} X_i \Big/ \sum_{i=1}^{j+1} X_i \right\}^{j}, \qquad j = 1, \ldots, n-1.$$

It is easily seen that W = E o J, and again is a characterization which is not invariant (Wang and Chang (1977) give an independent proof of the characterization).
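A short numerical check of W = E o J is sketched below in Python. It assumes the Wang and Chang form $Z_j = (\sum_{i \le j} X_i / \sum_{i \le j+1} X_i)^j$ and the E form $(Z_{(j)}/Z_{(j+1)})^j$ with the last denominator equal to 1; function names and data are hypothetical.

```python
import numpy as np

def wang_chang(x):
    """Z_j = ((X_1 + ... + X_j) / (X_1 + ... + X_{j+1})) ** j, j = 1..n-1."""
    s = np.cumsum(np.asarray(x, dtype=float))
    j = np.arange(1, s.size)
    return (s[:-1] / s[1:]) ** j

def j_transform(x):
    partial = np.cumsum(np.asarray(x, dtype=float))
    return partial[:-1] / partial[-1]

def e_transform(z_ordered):
    z = np.asarray(z_ordered, dtype=float)
    upper = np.append(z[1:], 1.0)
    return (z / upper) ** np.arange(1, z.size + 1)

x = np.array([0.5, 1.5, 2.0, 1.0])
w_direct = wang_chang(x)                  # 0.25, 0.25, 0.512
w_composed = e_transform(j_transform(x))  # same values
```

The common total $\sum_{i \le n} X_i$ cancels in each ratio of consecutive J values, which is why the two computations agree.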


3. An invariant CPIT. The idea behind the CPIT procedure is to apply the multivariate transformation due to Rosenblatt (1952) to the conditional distribution of the sample given the minimal sufficient statistic S for the family under consideration. In that way a set of independent uniforms is obtained. Since Rosenblatt's transformation requires absolute continuity of the multivariate distribution and since the conditional distribution of the whole sample given S is necessarily singular, one is forced to seek the maximum number of terms in the sample for which their conditional distribution given S is absolutely continuous (almost surely). In that way one gets the maximum number of uniforms.




If in this procedure, instead of considering the conditional distribution of $X_1, X_2, \ldots, X_n$ (the unordered sample) given S, we actually consider the conditional distribution of $X_{(1)}, X_{(2)}, \ldots, X_{(n)}$ given S, then the resulting transformation will be invariant. This procedure will be referred to as the invariant CPIT (ICPIT).


In what follows the ICPIT is worked out for the exponential case. Many of the attractive properties that are found in dealing with this case are due to the features of the exponential distribution. Nevertheless, the procedure outlined could prove useful in other cases as well, and other families are currently being studied.

In order to apply Rosenblatt's transformation to the conditional distribution of as many order statistics as possible while retaining absolute continuity, it is proposed that this maximum number should be n-1, just as in the ordinary CPIT.


Let $S$ stand for $\sum_{i=1}^{n} X_i$, and also let $S^{(2)}$ stand for $\sum_{i=2}^{n} X_{(i)}$, $S^{(3)}$ for $\sum_{i=3}^{n} X_{(i)}$, etc. $S$ is in this case the minimal sufficient statistic.

Suppose that the conditional distribution of $X_{(1)}, X_{(2)}, \ldots, X_{(n-1)}$ evaluated at $(x_{(1)}, x_{(2)}, \ldots, x_{(n-1)})$ given $S$ is absolutely continuous a.s., and denote it by $F_n(x_{(1)}, \ldots, x_{(n-1)})$.
Applying Rosenblatt's transformation in this conditional setting requires the computation of the marginal and conditional distributions $F_n(x_{(1)})$, $F_n(x_{(2)} \mid x_{(1)})$, ..., $F_n(x_{(n-1)} \mid x_{(1)}, \ldots, x_{(n-2)})$, which are afterwards used to produce the independent U(0,1) transforms $Z_1 = F_n(X_{(1)})$, $Z_2 = F_n(X_{(2)} \mid X_{(1)})$, ..., $Z_{n-1} = F_n(X_{(n-1)} \mid X_{(1)}, \ldots, X_{(n-2)})$.

If all of the conditional distributions and the marginal distribution employed are absolutely continuous a.s., then the joint $F_n(x_{(1)}, \ldots, x_{(n-1)})$ is absolutely continuous a.s. and vice versa, so it will suffice to verify the absolute continuity a.s. of the (n-1) proposed distributions.


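$F_n(x_{(1)})$ may be computed directly or by means of Bayes' formula, since the conditional distribution of $X_{(1)}$ given $S$ can be expressed in terms of the conditional distribution of $S$ given $X_{(1)}$ and the distribution of $S$, and these are well known. After doing the algebra we have

$$F_n(x_{(1)}) = \begin{cases} 0 & \text{if } x_{(1)} < 0 \\ 1 - (1 - n x_{(1)}/S)^{n-1} & \text{if } x_{(1)} \in (0, S/n) \\ 1 & \text{if } x_{(1)} > S/n; \end{cases}$$

thus $Z_1 = 1 - (1 - n X_{(1)}/S)^{n-1}$. The next step is to obtain $F_n(x_{(2)} \mid x_{(1)})$, which is $P[X_{(2)} \le x_{(2)} \mid S, X_{(1)}]$ evaluated at $X_{(1)} = x_{(1)}$. Note that knowledge of $S$ and $X_{(1)}$ is equivalent to knowledge of $S^{(2)}$ and $X_{(1)}$, and thus $F_n(x_{(2)} \mid x_{(1)}) = P[X_{(2)} \le x_{(2)} \mid S^{(2)}, X_{(1)}]$ evaluated at $X_{(1)} = x_{(1)}$.

Given $X_{(1)}$, the other observations $X_{(2)}, X_{(3)}, \ldots, X_{(n)}$ are distributed as an ordered sample of size (n-1) from an exponential with origin $X_{(1)}$; thus one can easily obtain $P[X_{(2)} \le x_{(2)} \mid S^{(2)}, X_{(1)}]$, and the above consideration yields

$$F_n(x_{(2)} \mid x_{(1)}) = \begin{cases} 0 & \text{if } x_{(2)} - x_{(1)} < 0 \\ 1 - \left[ 1 - (n-1)(x_{(2)} - x_{(1)}) / (S^{(2)} - (n-1) x_{(1)}) \right]^{n-2} & \text{if } (n-1)(x_{(2)} - x_{(1)}) \in (0, S^{(2)} - (n-1) x_{(1)}) \\ 1 & \text{elsewhere;} \end{cases} \tag{1}$$

thus, $Z_2 = 1 - \left[ 1 - (n-1)(X_{(2)} - X_{(1)}) / (S^{(2)} - (n-1) X_{(1)}) \right]^{n-2}$.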
For the computation of $F_n(x_{(3)} \mid x_{(1)}, x_{(2)})$, the Markovian property of the order statistics and the fact that knowledge of $S$, $X_{(1)}$ and $X_{(2)}$ is equivalent to that of $S^{(3)}$, $X_{(1)}$ and $X_{(2)}$ yield the following result: $F_n(x_{(3)} \mid x_{(1)}, x_{(2)})$ is $P[X_{(3)} \le x_{(3)} \mid S, X_{(1)}, X_{(2)}]$ evaluated at $X_{(1)} = x_{(1)}$, $X_{(2)} = x_{(2)}$. But $P[X_{(3)} \le x_{(3)} \mid S, X_{(1)}, X_{(2)}]$ is equal to $P[X_{(3)} \le x_{(3)} \mid S^{(3)}, X_{(1)}, X_{(2)}]$, and given $X_{(2)}$, $X_{(3)}$ is independent of $X_{(1)}$ and therefore

$$P[X_{(3)} \le x_{(3)} \mid S^{(3)}, X_{(1)}, X_{(2)}] = P[X_{(3)} \le x_{(3)} \mid S^{(3)}, X_{(2)}] \quad \text{a.s.} \tag{2}$$

Now, in order to compute this last conditional distribution we observe that given $X_{(2)}$, the set $X_{(3)}, X_{(4)}, \ldots, X_{(n)}$ is distributed as an ordered sample of size (n-2) from an exponential distribution with origin $X_{(2)}$, so the previous approach is repeated.


By extending these results, we have the following

Theorem. For $j = 1, 2, \ldots, n-1$, the r.v.

$$Z_j = 1 - \left[ 1 - \frac{(n-j+1)\,(X_{(j)} - X_{(j-1)})}{(X_{(j)} + X_{(j+1)} + \cdots + X_{(n)}) - (n-j+1)\,X_{(j-1)}} \right]^{n-j},$$

where $X_{(0)} \equiv 0$, are i.i.d. U(0,1) if $X_{(1)}, X_{(2)}, \ldots, X_{(n)}$ are ordered exponentials.


It can now be seen that the ICPIT can be written as the composition of previous characterizations: ICPIT = T o E o J o R o N.
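The composition can be verified numerically. The Python sketch below is illustrative and rests on stated assumptions: the ICPIT values are taken as $Z_j = 1 - [1 - (n-j+1)(X_{(j)} - X_{(j-1)}) / (X_{(j)} + \cdots + X_{(n)} - (n-j+1)X_{(j-1)})]^{n-j}$, and N, R, J, E, T are taken in the forms normalized spacings, index reversal, partial-sum ratios, $(Z_{(j)}/Z_{(j+1)})^j$ and $1 - z$. Function names and data are hypothetical. The composition returns the same values in reverse index order, which does not matter for an i.i.d. uniform sample.

```python
import numpy as np

def icpit_direct(x):
    """Z_j from the closed form, for j = 1, ..., n-1."""
    xs = np.sort(np.asarray(x, dtype=float))
    n = xs.size
    j = np.arange(1, n)                        # j = 1..n-1
    cur = xs[:-1]                              # X_(j)
    prev = np.concatenate(([0.0], xs[:-2]))    # X_(j-1), with X_(0) = 0
    tail = np.cumsum(xs[::-1])[::-1][:-1]      # X_(j) + ... + X_(n)
    m = n - j + 1
    ratio = m * (cur - prev) / (tail - m * prev)
    return 1.0 - (1.0 - ratio) ** (n - j)

def n_transform(x):
    xs = np.sort(np.asarray(x, dtype=float))
    return np.arange(xs.size, 0, -1) * np.diff(xs, prepend=0.0)

def j_transform(x):
    partial = np.cumsum(np.asarray(x, dtype=float))
    return partial[:-1] / partial[-1]

def e_transform(z_ordered):
    z = np.asarray(z_ordered, dtype=float)
    return (z / np.append(z[1:], 1.0)) ** np.arange(1, z.size + 1)

def icpit_composed(x):
    """T o E o J o R o N, applied right to left."""
    reversed_spacings = n_transform(x)[::-1]   # R o N
    return 1.0 - e_transform(j_transform(reversed_spacings))

x = np.array([0.5, 1.5, 2.0, 1.0])
direct = icpit_direct(x)        # approx 0.784, 0.75, 0.6667
composed = icpit_composed(x)    # same values, reverse order
```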

Comment.- Recall that K = J o N, W = E o J and C = T o E o J, and define also M = J o R o N. If in subsequent tests for uniformity we use test statistics which give the same value for a set $Z_j$ as for the corresponding set $1 - Z_j$, it is clear that C is equivalent to W = E o J (i.e. will give the same values of the test statistics) and ICPIT is equivalent to E o J o R o N. The test statistics used below, based on the empirical distribution function, have this property. Note also that since R simply indexes exponentials in reverse order, and does not give new values, we might expect K and M to have the same power properties. The step E which appears in the C composition will have the property that it takes superuniforms, which can sometimes be produced by J, into samples which would be declared significant using the usual (upper) tail of the test statistics. Thus we do not have to guard against the possibility of superuniforms as Seshadri, Csorgo and Stephens (1969) found necessary with J alone. On the other hand, E might sometimes take non-uniform samples into more evenly spaced observations, thus weakening the test; we might then find ICPIT, for some alternatives, giving a sample which is not so easily detected for non-uniformity as M, or its equivalent K.
4. Power Studies. These points have been examined by extensive power studies, using C, ICPIT and M. After the samples were taken from the alternatives listed, the statistics used for testing uniformity were the Cramer-von Mises $W^2$, Watson $U^2$, Kolmogorov-Smirnov $D$, Kuiper $V$ and Anderson-Darling $A^2$. Tables 1 and 2 give results, one for sample size n = 16 and the second for n = 20, both with tests of size $\alpha = 0.10$. These should be compared with the results for J and K given by Seshadri, Csorgo and Stephens (1969), and with the more extensive tables for K alone given by Stephens (1978).
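For reference, the five EDF statistics can be computed from a transformed sample as sketched below in Python. The sketch uses the standard computing formulas for testing U(0,1) and is not taken from the original paper; the function name `edf_statistics` and the data are hypothetical. It also checks the symmetry property relied on earlier, namely that each statistic takes the same value on a set $Z_j$ and on the set $1 - Z_j$.

```python
import numpy as np

def edf_statistics(z):
    """EDF statistics for testing that z is a sample from U(0, 1)."""
    z = np.sort(np.asarray(z, dtype=float))
    n = z.size
    i = np.arange(1, n + 1)
    d_plus = np.max(i / n - z)             # largest excess of the EDF over the uniform CDF
    d_minus = np.max(z - (i - 1) / n)      # largest excess of the uniform CDF over the EDF
    w2 = np.sum((z - (2 * i - 1) / (2 * n)) ** 2) + 1 / (12 * n)   # Cramer-von Mises
    u2 = w2 - n * (z.mean() - 0.5) ** 2                             # Watson
    a2 = -n - np.sum((2 * i - 1) * (np.log(z) + np.log(1 - z[::-1]))) / n  # Anderson-Darling
    return {
        "W2": w2,
        "U2": u2,
        "D": max(d_plus, d_minus),         # Kolmogorov-Smirnov
        "V": d_plus + d_minus,             # Kuiper
        "A2": a2,
    }

z = np.array([0.35, 0.10, 0.90, 0.60])     # hypothetical transformed values
stats = edf_statistics(z)
stats_reflected = edf_statistics(1.0 - z)  # same sample after z -> 1 - z
```

Reflection swaps the two one-sided Kolmogorov quantities, so $D$ and $V$ are unchanged, and the summands of $W^2$ and $A^2$ are simply reordered.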


Comments.- (a).- The presence of E after J in C does give upper tail significance for the samples from the half-normal distribution where J alone gives superuniforms, as was conjectured above. C is still not as powerful in this case as J, using the lower tail of the test statistic, but it must be emphasized that one would not know that the lower tail is needed, so that C is to be preferred to J. (b).- As conjectured, M gives results very close to those given in Stephens (1978) for K for a wide range of alternatives. (c).- In general M (or K) gives results a little better than ICPIT, which overall is better than C; in other words the transformation related to ICPIT which gives ordered uniforms is preferred to ICPIT which gives unordered uniforms. (d).- K is therefore justified again as an effective characterization, and the ICPIT, which reproduces K, is shown to provide a systematic approach to invariant characterizations. Unfortunately, for most families, the ICPIT will be difficult computationally; we hope that this work will stimulate further research into the general problem of finding invariant characterizations.


TABLE 2.- Simulated power in % for procedures C, ICPIT and M followed by $W^2$, $U^2$, $V$, $D$ and $A^2$. Sample size n = 20; significance level $\alpha = 0.10$; 500 samples from each distribution.
CHARACTERIZATIONS AND GOODNESS OF FIT TESTS

In this article a systematic approach to providing goodness of fit tests is discussed, for the composite goodness of fit problem of testing that the distribution F of a random sample comes from a parametric family. Characterization procedures are emphasized, and it is shown that, at least for the exponential case, invariant characterizations appear to be better than those which are not invariant. A general technique is developed for producing invariant characterizations and for the exponential case it is shown how these are related to characterizations already in the literature. Power studies are given to examine the tests based on both invariant and non-invariant characterizations.